We give an explicit procedure based on entangled input states for estimating an SU(d) operation U with rate of convergence $1/N^2$ when sending N particles through the device. We prove that this rate is optimal. We also evaluate the constant C such that the asymptotic risk is $C/N^2$. However, other strategies might yield a better constant C.
I. INTRODUCTION

The question that we are investigating in this paper is: "What is the best way of estimating a unitary operation U?"

By "unitary operation", we mean a device (or a channel) that sends a density operator $\rho_0$ onto another density operator $\rho = U \rho_0 U^*$, where $U \in SU(d)$ is a special unitary matrix.

We immediately stress that the solution to this estima- 
tion problem can be divided into two parts: what is the 
input state, and which measurement (POVM) to apply 
on the output state? Indeed, in order to estimate the 
channel U, we have to let it act on a state (the input 
state). And once we have the output state, the problem 
consists in discriminating states in the family of possible 
output states. 

This estimation of unitary operation has been exten- 
sively studied over the last few years. 

The first invitation was [1], featuring numerous special cases. In most of those, the unitary U is known to belong to some subset of SU(2).

Then [2] provided the form of an optimal state to be sent in, with non-specified coefficients depending on the cost function (we give the formula of this state in equation (2.2)). In that paper the authors consider the situation where the unitary operation is performed independently on N systems. That study applied to any SU(d), and any covariant loss function, in particular fidelity, in a Bayesian framework. The proposed input state uses an ancilla, that is an auxiliary system that is not sent through the unitary channel, with Hilbert space $(\mathbb{C}^d)^{\otimes N}$. The state is prepared as a superposition of maximally entangled states, one for each irreducible representation of SU(d) appearing in $(\mathbb{C}^d)^{\otimes N}$. We emphasize that the state is an entangled state of $(\mathbb{C}^d)^{\otimes N} \otimes (\mathbb{C}^d)^{\otimes N}$: we do not send N copies of an entangled state through the device, but all the N systems that are sent through the channel together with the N particles of the ancilla are part of the same entangled state, yielding the most general possible strategy. There was no evaluation of the rate of convergence, though.

Subsequent works mainly focused on SU(2), as the case is simpler and yields many applications, e.g. transmission of reference frames in quantum communication. Indeed, the latter is equivalent to the estimation of an SU(2) operation. The first strategy to be proved to converge (in fidelity) at rate $1/N^2$ was not covariant [3]. It made no use of an ancilla. Later, the same rate was achieved for a covariant measurement with an ancilla, through a judicious choice of the coefficients left free in the state (2.2). The optimal constant ($\pi^2/N^2$ for the fidelity) was also computed. It was almost simultaneously noticed that asymptotically the ancilla is unnecessary. Indeed what we need is entangling different copies of the same irreducible representation. Now each irreducible representation appears with some multiplicity in the N-fold tensor product, most of them with higher multiplicity than dimension, which is the condition we need. This method was dubbed "self-entanglement". The advantage is that we need to prepare half the number of particles, as we do not need an ancilla. In all these articles, the Bayesian paradigm with uniform prior was used. The same $1/N^2$ rate was shown to hold true in a minimax sense, in pointwise estimation. We stress the importance of this $1/N^2$ rate, proving how useful entanglement can be. Indeed, in classical data analysis, we cannot expect a better rate than $1/N$. Similarly the $1/N$ bound holds for any strategy where the N particles we send through the device are not entangled "among themselves" (that is, even if there is an ancilla for each of these N particles).

Another popular theme has been the determination of the phase $\phi$ for unitaries of the form $U_\phi = e^{i\phi H}$. This very special case already has many applications, especially in interferometry or measurement of small forces, as featured in the review article [1] and references therein. A common feature of the most efficient techniques is the need for entangled states of many particles, and much experimental work has aimed at generating such states. These methods essentially involve either manipulation of photons obtained through parametric down-conversion (for example [§|), ions in ion traps (for example [10]) or atoms in cavity QED (for example [11]).

In recent years, there has been renewed interest in the SU(d) case. Notably, [12] takes off from [2], allowing for more general symmetries and making explicit for natural cost functions both the free coefficients (as the coordinates of the eigenvector of a matrix) and the POVM (see Theorem II.1 below). With a completely different strategy, aiming rather at pointwise estimation (and therefore minimax theorems), an input state for $U^{\otimes N}$ was found in [13] such that the Quantum Fisher Information matrix scales like $N^2$, yielding hopes of getting as fast an estimator for SU(d). No associated measurement was found in that paper.

Given the state of the art, a natural question is whether we can obtain, as for SU(2), this dramatic increase in performance when using entanglement for general SU(d). That is, do we have an estimation procedure whose rate is $1/N^2$, instead of $1/N$? Neither [12], where the asymptotics are not studied for SU(d), nor [13], where no measurement is given, answers this question.

In this article, we first prove that we cannot expect a better rate than $1/N^2$. This kind of bound based on the laws of quantum physics, without any a priori on the experimental device, is traditionally called the Heisenberg limit of the problem. Then we choose a completely explicit input state of the form (2.2) by specifying the coefficients. By using the associated POVM, the estimator of a unitary quantum operation $U \in SU(d)$ converges at rate $1/N^2$. The constant is not optimal, but is briefly studied at the end of the paper. We obtain these results with fidelity as a cost function, both in a Bayesian setting, with a uniform prior, and in a minimax setting. Notice that we shall not need an ancilla.

The next section consists in formulating the problem and restating Theorem 2 of [12] within our framework. Section III then shows that it is impossible to converge at a rate faster than $O(N^{-2})$. In section IV we write a general formula for the risk of a strategy as described in Theorem II.1, and in section V we specify our estimators by choosing our coefficients in (2.2). We then prove that the risk of this estimator is $O(N^{-2})$. The last section (VI) consists in finding the precise asymptotic speed of our procedure, that is the constant C in $CN^{-2}$. We finish by stating in Theorem VI.1 the results of the paper.

II. DESCRIPTION OF THE PROBLEM 

We are given an unknown unitary operation $U \in SU(d)$ and must estimate it "as precisely as possible". We are allowed to let it act on N particles, so that we are discriminating between the possible $U^{\otimes N}$. We shall work both with pointwise estimation (as preferred by mathematicians) and with a Bayes uniform prior (a favorite of physicists).

Any estimation procedure can be described as follows (see Figure 1): the unitary channel acts as

$$U^{\otimes N} \otimes 1 : (\mathbb{C}^d)^{\otimes N} \otimes \mathcal{K} \to (\mathbb{C}^d)^{\otimes N} \otimes \mathcal{K},$$

on the space of the N systems together with a possible ancilla. The input state $\rho_N \in M((\mathbb{C}^d)^{\otimes N} \otimes \mathcal{K})$ is mapped into an output state on which we perform a measurement M whose result is the estimator $\hat U \in SU(d)$.

Figure 1: Most general estimation scheme of U when N copies are available at the same time, and using entanglement.

In order to evaluate the quality of an estimator $\hat U$, we fix a cost function $\Delta(U, V)$. The global pointwise risk of the estimator is

$$R_P(\hat U) = \sup_{U \in SU(d)} E_U[\Delta(U, \hat U)].$$

The probability distribution of $\hat U$ depends on $U$, and we take expectation with respect to this probability distribution.

On the other hand, the Bayes risk with uniform prior is:

$$R_B(\hat U) = \int_{SU(d)} E_U[\Delta(U, \hat U)] \, d\mu(U),$$

where $\mu$ is the Haar measure on SU(d).

As cost function, we choose the fidelity F (or rather $1 - F$), which for an element of SU(d) is defined as:

$$\Delta(U, \hat U) = 1 - F(U, \hat U) = 1 - \frac{|\chi_\square(U^{-1}\hat U)|^2}{d^2},$$

where $\chi_\square$ is the character of the defining representation of SU(d), whose Young tableau consists of only one box. In other words, $\chi_\square(U) = \mathrm{Tr}(U)$.
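The cost function above is easy to evaluate numerically. The following is a minimal sketch in plain Python, assuming that unitaries are stored as nested lists of complex numbers; the helper names `matmul`, `dagger`, `cost` and `phase_gate` are illustrative and do not come from the paper.

```python
import cmath


def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def dagger(a):
    """Conjugate transpose; for a unitary matrix this is its inverse."""
    n = len(a)
    return [[a[j][i].conjugate() for j in range(n)] for i in range(n)]


def cost(u, u_hat):
    """Delta(U, U_hat) = 1 - |Tr(U^{-1} U_hat)|^2 / d^2 for unitary inputs."""
    d = len(u)
    m = matmul(dagger(u), u_hat)
    trace = sum(m[i][i] for i in range(d))
    return 1.0 - abs(trace) ** 2 / d ** 2


def phase_gate(theta):
    """Hypothetical test unitary diag(e^{i theta}, e^{-i theta}) in SU(2)."""
    return [[cmath.exp(1j * theta), 0j], [0j, cmath.exp(-1j * theta)]]


IDENTITY = [[1 + 0j, 0j], [0j, 1 + 0j]]
```

For instance, `cost(phase_gate(0.3), phase_gate(0.3))` vanishes up to rounding, and `cost(u, v)` agrees with `cost(IDENTITY, matmul(dagger(u), v))`, which is the covariance property used in the next paragraphs.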

Before really addressing the problem, we make a few 
remarks on why this choice of distance is suitable for 
mathematical analysis. 

Firstly, this cost function is covariant, i.e. $\Delta(U, \hat U) = \Delta(1_{\mathbb{C}^d}, U^{-1}\hat U)$.

Secondly, a useful feature within the Bayesian framework is that $\Delta$ is of the form (2.1), as required in Theorem II.1. Indeed we can rewrite $\Delta(U, \hat U)$ as $1 - \chi_\square(U^{-1}\hat U)\,\chi_\square^*(U^{-1}\hat U)/d^2$. Now the conjugate of a character is the character of the conjugate (contragredient) representation, and the product of two characters is again the character of a possibly reducible representation $\pi$. This character is equal to the sum of the characters of the irreducible representations appearing in the Clebsch-Gordan development of $\pi$, in which all coefficients are non-negative. Therefore $\Delta = 1 - \sum_\lambda a_\lambda \chi_\lambda^*$, where $a_\lambda \ge 0$ and $\lambda$ runs over all irreducible representations of SU(d). That is the condition (2.1) that we shall need for applying Theorem II.1, given at the end of the section.

On the other hand, the theory of pointwise estimation usually deals with the variance of the estimated parameters when we use a smooth parameterization of SU(d). As we want to use the Quantum Cramér-Rao Bound (3.4), we need $\Delta$ to be quadratic in the parameters to the first order, and bounded below by a positive constant for $\hat U$ outside a neighborhood of U. As $\Delta$ is covariant, it is sufficient to check this with $U = 1_{\mathbb{C}^d}$. Now an example of a smooth parameterization in a neighborhood of the identity is $U(\theta) = \exp(\sum_\alpha \theta_\alpha T_\alpha)$ where $\theta \in \mathbb{R}^{d^2-1}$ and the $T_\alpha$ are generators of the Lie algebra, so that $\mathrm{Tr}(T_\alpha) = 0$. Now $\mathrm{Tr}[\exp(\sum_\alpha \theta_\alpha T_\alpha)] = d + \sum_\alpha \theta_\alpha \mathrm{Tr}(T_\alpha) + O(\|\theta\|^2)$, so that the trace minus d, and consequently $\Delta$, is quadratic in $\theta$ to the first order.
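The quadratic behaviour near the identity can be checked on a one-parameter example. The sketch below assumes d = 2 and the single traceless generator T = diag(i, -i), so that U(θ) = diag(e^{iθ}, e^{-iθ}); this choice and the function name are illustrative assumptions, not taken from the paper.

```python
import cmath


def cost_from_identity(theta):
    """Delta(1, U(theta)) for U(theta) = diag(e^{i theta}, e^{-i theta}) in SU(2)."""
    trace = cmath.exp(1j * theta) + cmath.exp(-1j * theta)  # equals 2 cos(theta)
    return 1.0 - abs(trace) ** 2 / 4.0


# The ratio Delta / theta^2 should tend to a positive constant as theta -> 0.
ratios = [cost_from_identity(t) / t ** 2 for t in (0.1, 0.01, 0.001)]
```

In this special case the cost equals sin²θ, so the ratios approach 1: the cost has no linear term and a strictly positive quadratic term, which is the property needed for the Cramér-Rao argument.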

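As stated at the beginning of this section, we are working with $U^{\otimes N}$. The Clebsch-Gordan decomposition of the N-th tensor product representation is

$$U^{\otimes N} = \bigoplus_{\lambda : |\lambda| = N} U^\lambda \otimes 1_{\mathbb{C}^{\mathcal{M}(\lambda)}}$$

acting on $\bigoplus_{\lambda : |\lambda| = N} \mathcal{H}^\lambda \otimes \mathbb{C}^{\mathcal{M}(\lambda)}$, where $\mathcal{H}^\lambda = \mathbb{C}^{\mathcal{D}(\lambda)}$ is the representation space of $\lambda$, $\mathcal{M}(\lambda)$ is the multiplicity of $\lambda$ in the N-th tensor product representation, and $\mathcal{D}(\lambda)$ the dimension of $\lambda$. We refer to $\mathbb{C}^{\mathcal{M}(\lambda)}$ as the multiplicity space of $\lambda$. We have indexed the irreducible representations of SU(d) by $\lambda = (\lambda_1, \ldots, \lambda_d)$, and written $|\lambda| = \sum_{i=1}^d \lambda_i$. Notice that this labelling of irreducible representations is redundant, but that if $|\lambda^1| = |\lambda^2|$, then $\lambda^1$ and $\lambda^2$ are equivalent (denoted $\lambda^1 \equiv \lambda^2$) if and only if $\lambda^1 = \lambda^2$.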
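The decomposition can be sanity-checked by counting dimensions: summing multiplicity times dimension over all labels with |λ| = N must give d^N. The sketch below assumes the standard Weyl dimension formula for the dimension and the Frobenius determinantal formula for the multiplicity; these formulas and all function names are brought in for illustration and are not quoted from the paper.

```python
from fractions import Fraction
from math import factorial, prod


def partitions(n, d):
    """Yield every non-increasing d-tuple of non-negative integers summing to n."""
    def rec(remaining, slots, largest):
        if slots == 0:
            if remaining == 0:
                yield ()
            return
        smallest = -(-remaining // slots)  # ceil(remaining / slots)
        for first in range(min(remaining, largest), smallest - 1, -1):
            for rest in rec(remaining - first, slots - 1, first):
                yield (first,) + rest
    yield from rec(n, d, n)


def dimension(lam):
    """Weyl dimension formula for the irreducible representation labelled by lam."""
    d = len(lam)
    value = Fraction(1)
    for i in range(d):
        for j in range(i + 1, d):
            value *= Fraction(lam[i] - lam[j] + j - i, j - i)
    return int(value)


def multiplicity(lam):
    """Multiplicity of lam in the |lam|-fold tensor power (Frobenius formula)."""
    d = len(lam)
    shifted = [lam[i] + d - 1 - i for i in range(d)]
    numerator = factorial(sum(lam)) * prod(
        shifted[i] - shifted[j] for i in range(d) for j in range(i + 1, d))
    return numerator // prod(factorial(s) for s in shifted)


def total_dimension(n, d):
    """Sum of multiplicity * dimension over all labels; should equal d ** n."""
    return sum(multiplicity(lam) * dimension(lam) for lam in partitions(n, d))
```

For example, `total_dimension(5, 3)` recovers 3^5 = 243, with contributions from the five labels (5,0,0), (4,1,0), (3,2,0), (3,1,1) and (2,2,1).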

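The starting point of our argument will be the following reformulation of the results of [12], with less generality, and without the formula for the risk, whose form is not adapted to our subsequent analysis:

Theorem II.1. Let $U \in SU(d)$ be a unitary operation to be estimated, through its action on N particles. We may use entanglement and/or an ancilla.

Then, for a uniform prior and any cost function of the form

$$c(U, \hat U) = a_0 - \sum_\lambda a_\lambda \chi_\lambda^*(U^{-1}\hat U), \qquad (2.1)$$

we can find as optimal input state a pure state of the form

$$|\Phi\rangle = \bigoplus_{\lambda : |\lambda| = N} \frac{c(\lambda)}{\sqrt{\mathcal{D}(\lambda)}} \sum_{i=1}^{\mathcal{D}(\lambda)} |\psi_i^\lambda\rangle \otimes |\phi_i^\lambda\rangle \qquad (2.2)$$

with $c(\lambda) \ge 0$, and the normalization condition

$$\sum_\lambda c(\lambda)^2 = 1. \qquad (2.3)$$

Moreover $|\psi_i^\lambda\rangle$ is an orthonormal basis of $\mathcal{H}^\lambda$ and the $|\phi_i^\lambda\rangle$ are orthonormal vectors of the multiplicity space, which may be augmented by an ancilla if necessary (see remark below on the dimensions).

The corresponding measurement is the covariant POVM with seed $\Xi = |\eta\rangle\langle\eta|$ given by:

$$|\eta\rangle = \bigoplus_{\lambda | c(\lambda) \neq 0} \sqrt{\mathcal{D}(\lambda)} \sum_{i=1}^{\mathcal{D}(\lambda)} |\psi_i^\lambda\rangle \otimes |\phi_i^\lambda\rangle, \qquad (2.4)$$

that is a POVM whose density with respect to the Haar measure is given by $m(\hat U) = \hat U |\eta\rangle\langle\eta| \hat U^*$ with

$$\hat U |\eta\rangle = \bigoplus_{\lambda | c(\lambda) \neq 0} \sqrt{\mathcal{D}(\lambda)} \sum_{i=1}^{\mathcal{D}(\lambda)} \hat U^\lambda |\psi_i^\lambda\rangle \otimes |\phi_i^\lambda\rangle.$$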

Remark: We use $\mathcal{D}(\lambda)$ orthonormal vectors in the multiplicity space of $\lambda$. This requires $\mathcal{M}(\lambda) \ge \mathcal{D}(\lambda)$. If this is not the case, we must increase the dimension of the multiplicity space by using an ancilla in $\mathbb{C}^S$. Then the action of U is $U^{\otimes N} \otimes 1_{\mathbb{C}^S}$, whose Clebsch-Gordan decomposition is $\bigoplus_{\lambda : |\lambda| = N} U^\lambda \otimes 1_{\mathbb{C}^{S\mathcal{M}(\lambda)}}$. With big enough S, we have $S\mathcal{M}(\lambda) \ge \mathcal{D}(\lambda)$. Notice that an ancilla is not necessary if $c(\lambda) = 0$ for all $\lambda$ such that $\mathcal{D}(\lambda) > \mathcal{M}(\lambda)$.

Another remark is that, as defined, our POVM is not properly normalized: $M(SU(d)) \neq 1$, but is equal to the projection on the space spanned by the $U|\Phi\rangle$. As this is the only subspace of importance, we can complete the POVM (through the seed, for example) ad libitum.

Our estimator $\hat U$ is the result of the measurement with POVM defined by (2.4) and input state of the form (2.2) with specific $c(\lambda)$. Such an estimator is covariant, that is $p_U(\hat U) = p_{1_{\mathbb{C}^d}}(U^{-1}\hat U)$, where $p_U$ is the probability distribution of $\hat U$ when we are estimating U. The cost function is also covariant, so that $E_U[\Delta(U, \hat U)]$ does not depend on U. This implies that the Bayesian risk and the pointwise risk coincide. With the second equality true for all $U \in SU(d)$, we have:

$$R_B(\hat U) = R_P(\hat U) = E_U[\Delta(U, \hat U)]. \qquad (2.5)$$

Theorem II.1 states that there exists an optimal (Bayes uniform) estimator $\hat U_o$ of this form (corresponding to the optimal choice of $c(\lambda)$), so that it obeys (2.5). From this we first prove that no estimator whatsoever can have a better rate than $1/N^2$.

III. WHY WE CANNOT EXPECT BETTER
RATE THAN $1/N^2$

For proving this result, we need the Bayesian risk for priors $\pi$ other than the uniform prior:

$$R_\pi(\hat U) = E_\pi[E_U[\Delta(U, \hat U)]].$$

As $\hat U_o$ is Bayesian optimal for the uniform prior, we only have to prove that $R_B(\hat U_o) = \Omega(N^{-2})$, that is, that this risk is bounded below by a positive constant times $N^{-2}$. This is also sufficient for pointwise risk as, for any estimator $\hat U$, we have $R_B(\hat U) \le R_P(\hat U)$. Moreover, as $E_U[\Delta(U, \hat U_o)]$ does not depend on U, $R_\pi(\hat U_o) = R_B(\hat U_o)$. It is then sufficient to prove, for a $\pi$ of our choice, that:

$$R_\pi(\hat U_o) = \Omega(N^{-2}). \qquad (3.1)$$

The idea is to find a Cramér-Rao bound that we can apply to some $\pi$. We shall combine the Braunstein and Caves information inequality (3.3) and the Van Trees inequality (3.2) to obtain the desired Quantum Cramér-Rao Bound. This bound will yield an explicit rate through a result of [13].

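Van Trees' inequality states that given a classical statistical model smoothly parameterized by $\theta \in \Theta \subset \mathbb{R}^p$, and a smooth prior $\pi$ with compact support $\Theta_0 \subset \Theta$, then for any estimator $\hat\theta$, we have:

$$E_\pi[\mathrm{Tr}(V_{\hat\theta}(\theta))] \ge \frac{p^2}{E_\pi[\mathrm{Tr}(I(\theta))] + \mathcal{I}_\pi}, \qquad (3.2)$$

where $I(\theta)$ is the Fisher information matrix of the model at point $\theta$, $\mathcal{I}_\pi$ is a finite (for reasonable $\pi$) constant depending on $\pi$ (quantifying in some way the prior information), and $V_{\hat\theta}(\theta) \in M_p(\mathbb{R})$ is the mean square error (MSE) matrix of the estimator $\hat\theta$ at point $\theta$, given by:

$$V_{\hat\theta}(\theta)_{\alpha,\beta} = E_\theta[(\hat\theta_\alpha - \theta_\alpha)(\hat\theta_\beta - \theta_\beta)].$$

This form of the Van Trees inequality is obtained from the general multivariate statement by setting N = 1, G = C = Id and $\psi = \theta$ in its equation (12).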

Now the Braunstein and Caves information inequality yields an upper bound on the information matrix $I_M(\theta)$ of any classical statistical model obtained by applying the measurement M to a quantum statistical model. For any family of quantum states parameterized by a p-dimensional parameter $\theta \in \Theta \subset \mathbb{R}^p$, for any measurement M on these states, the following holds:

$$I_M(\theta) \le H(\theta), \qquad (3.3)$$

where $H(\theta)$ is the quantum Fisher information matrix at point $\theta$.

Now it was proved in [13] that for a smooth parameterization of an open set of SU(d), and for any input state, the quantum Fisher information of the output states fulfils:

$$H(\theta) = O(N^2).$$

Inserting in (3.2) together with (3.3) we get as quantum Cramér-Rao bound

$$E_\pi[\mathrm{Tr}(V_{\hat\theta}(\theta))] = \Omega\left(\frac{1}{N^2}\right). \qquad (3.4)$$



We now want to apply this bound to obtain (3.1). There are a few small technical difficulties. First of all, we cannot use the uniform prior for $\pi$ as SU(d) is not homeomorphic to an open set of $\mathbb{R}^p$. We then have to define two neighborhoods of the identity $\Theta_0 \subset \Theta$, allowing us to use the Van Trees inequality. Now our estimator $\hat U_o$ need not be in $\Theta$, so that we shall in fact apply the Van Trees inequality to a modified estimator $\tilde U$. Finally, this bound is on the variance, and we must relate it to $\Delta$.

Our first task consists in restricting our attention to a neighborhood $\Theta$ of $1_{\mathbb{C}^d}$. It corresponds to a neighborhood (we use the same notation) of $0 \in \mathbb{R}^{d^2-1}$ through $U = \exp(\sum_\alpha \theta_\alpha T_\alpha)$. This holds if the neighborhood is small enough, so we define $\Theta$ by $U \in \Theta$ if and only if $\Delta(1_{\mathbb{C}^d}, U) < \epsilon$ for a fixed small enough $\epsilon$. We define $\Theta_0$ through $U \in \Theta_0$ for $\Delta(1_{\mathbb{C}^d}, U) < \epsilon/3$, and take a smooth fixed prior $\pi$ with support in $\Theta_0$, such that $\mathcal{I}_\pi < \infty$.

Now we modify our estimator $\hat U_o$ into an estimator $\tilde U$ given by $\tilde U = \hat U_o$ for $\hat U_o \in \Theta$ and $\tilde U = 1_{\mathbb{C}^d}$ for $\hat U_o \notin \Theta$. Then, by the triangle inequality, for any $U \in \Theta_0$, we have $\Delta(U, \hat U_o) \ge \Delta(U, \tilde U)$.

The fundamental point of the reasoning (used at (3.5)) is that, as $\Delta$ is quadratic at the first order, there is a positive constant c such that, for any $U^1, U^2 \in \Theta$, corresponding to $\theta^1, \theta^2$, we have $\Delta(U^1, U^2) \ge c \sum_\alpha (\theta^1_\alpha - \theta^2_\alpha)^2$.

Finally we get

$$R_\pi(\hat U_o) = E_\pi[E_U[\Delta(U, \hat U_o)]] \ge E_\pi[E_U[\Delta(U, \tilde U)]] \ge c\, E_\pi[\mathrm{Tr}(V_{\tilde\theta}(\theta))] = \Omega(N^{-2}). \qquad (3.5)$$

We have thus proved (3.1), and hence our bound on the efficiency of any estimator.

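We now write formulas for the risk of any estimator of the form given in Theorem II.1.

IV. FORMULAS FOR THE RISK

By (2.5), our risk $R_P(\hat U)$ is equal to the pointwise risk at $1_{\mathbb{C}^d}$, with which we shall work:

$$R_P(\hat U) = \int_{SU(d)} \Delta(1_{\mathbb{C}^d}, \hat U)\, p_{1_{\mathbb{C}^d}}(\hat U)\, d\mu(\hat U). \qquad (4.1)$$

Now we compute the probability distribution of $\hat U$ for a given $|\Phi\rangle$ of the form (2.2), that is

$$p_{1_{\mathbb{C}^d}}(\hat U) = \langle\Phi|\hat U|\eta\rangle\langle\eta|\hat U^*|\Phi\rangle = \Big| \sum_{\lambda : |\lambda| = N} \frac{c(\lambda)}{\sqrt{\mathcal{D}(\lambda)}} \sqrt{\mathcal{D}(\lambda)} \sum_{i=1}^{\mathcal{D}(\lambda)} \langle\psi_i^\lambda|\hat U^\lambda|\psi_i^\lambda\rangle \Big|^2 = \Big| \sum_{\lambda : |\lambda| = N} c(\lambda)\, \chi_\lambda(\hat U) \Big|^2,$$

where we have used that the character $\chi_\lambda$ of $\lambda$ is the trace of U in the representation.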

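Then, using (4.1), recalling that $p_{1_{\mathbb{C}^d}}$ is a probability density for Haar measure $\mu$ on SU(d), and that $\chi_{\lambda_1}\chi_{\lambda_2} = \chi_{\lambda_1 \otimes \lambda_2}$ (for the second term), we get:

$$R_P(\hat U) = 1 - \frac{1}{d^2} \int_{SU(d)} \Big| \sum_{\lambda : |\lambda| = N} c(\lambda)\, \chi_{\lambda \otimes \square}(U) \Big|^2 d\mu(U). \qquad (4.2)$$

In order to evaluate the second term, we use the following orthogonality relations for characters:

$$\int_{SU(d)} \chi_{\lambda_1}(U)\, \chi^*_{\lambda_2}(U)\, d\mu(U) = \delta_{\lambda_1 \equiv \lambda_2}. \qquad (4.3)$$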

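To do so we need the Clebsch-Gordan series of $\lambda \otimes \square$:

$$\lambda \otimes \square = \bigoplus_{\{1 \le i \le d \,|\, \lambda_{i-1} > \lambda_i\}} (\lambda + e_i), \qquad (4.4)$$

where conventionally $\lambda_0 = +\infty$ and $\lambda_{d+1} = 0$. Here we see $\lambda$ as a d-dimensional vector and $e_i$ as the i-th basis vector. The condition simply states that $\lambda + e_i$ is still non-increasing. We then reorganize the sum of characters as:

$$\sum_{\lambda : |\lambda| = N} c(\lambda)\, \chi_{\lambda \otimes \square}(U) = \sum_{\lambda' : |\lambda'| = N+1} \sum_{i \in S(\lambda')} c(\lambda' - e_i)\, \chi_{\lambda'}(U),$$

where $S(\lambda')$ is the set of i between 1 and d such that $\lambda' - e_i$ is still a representation, that is $\lambda'_i > \lambda'_{i+1}$. We shall write $\#S(\lambda')$ for its cardinality.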
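The two index sets used in this reorganization can be made concrete. The sketch below assumes labels are stored as non-increasing tuples of length d; `add_box` lists the labels obtained by adding one box to a row, and `removable_rows` returns the set written S(λ') in the text. Both names are illustrative and not from the paper.

```python
def add_box(lam):
    """Labels lam + e_i that are still non-increasing (summands of lam tensor box)."""
    out = []
    for i in range(len(lam)):
        if i == 0 or lam[i - 1] > lam[i]:
            out.append(lam[:i] + (lam[i] + 1,) + lam[i + 1:])
    return out


def removable_rows(lam_prime):
    """S(lam'): 1-based rows i with lam'_i > lam'_{i+1}, taking lam'_{d+1} = 0."""
    extended = lam_prime + (0,)
    return [i + 1 for i in range(len(lam_prime)) if extended[i] > extended[i + 1]]


def remove_box(lam_prime, i):
    """Return lam' - e_i for a 1-based row index i."""
    return lam_prime[:i - 1] + (lam_prime[i - 1] - 1,) + lam_prime[i:]
```

The two descriptions are dual: λ' appears in `add_box(lam)` exactly when `lam` equals `remove_box(lam_prime, i)` for some i in `removable_rows(lam_prime)`, which is what allows the double sum to be re-indexed by λ'.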

Inserting in (4.2) and remembering (4.3), we are left with

$$R_P(\hat U) = 1 - \frac{\sum_{\lambda' : |\lambda'| = N+1} \big| \sum_{i \in S(\lambda')} c(\lambda' - e_i) \big|^2}{d^2}. \qquad (4.5)$$

To go any further, we must work with specific $c(\lambda)$.



V. CHOICE OF THE COEFFICIENTS c(λ) AND
PROOF OF THEIR EFFICIENCY

We now have to choose the coefficients $c(\lambda)$ so that the right-hand side of (4.5) is small.

It appears useful to introduce subsets of the set of all irreducible representations. Let $\mathcal{P}_N = \{\lambda \,|\, |\lambda| = N;\ \lambda_1 > \cdots > \lambda_d > 0\}$. Obviously, if $\lambda' \in \mathcal{P}_{N+1}$, then $\#S(\lambda') = d$, and the converse is true. We can see them intuitively as points on a (d − 1)-dimensional surface, and with this picture in mind, we shall speak of the border of $\mathcal{P}_N$ (when $\lambda_i = \lambda_{i+1} + 1$ for some i), or of being far from the border (without precise mathematical meaning).

We are ready to give heuristic arguments on how good coefficients should behave.

We must try to get the fraction in (4.5) close to one. Now

$$\frac{\sum_{\lambda' : |\lambda'| = N+1} \big| \sum_{i \in S(\lambda')} c(\lambda' - e_i) \big|^2}{d^2} \le \sum_{\lambda' : |\lambda'| = N+1} \frac{\#S(\lambda') \sum_{i \in S(\lambda')} |c(\lambda' - e_i)|^2}{d^2} \le \sum_{\lambda' : |\lambda'| = N+1} \frac{\sum_{i \in S(\lambda')} |c(\lambda' - e_i)|^2}{d} \le \sum_{\lambda : |\lambda| = N} |c(\lambda)|^2 = 1.$$

The first inequality was obtained using the Cauchy-Schwarz inequality for each inner sum. There is equality if $c(\lambda' - e_i)$ does not depend on i. From this, we deduce that for most $\lambda'$, the $c(\lambda' - e_i)$ must be approximately equal, especially if they are large. The second inequality follows from $\#S(\lambda') \le d$. From this we deduce that for $\lambda' \notin \mathcal{P}_{N+1}$, the coefficients $c(\lambda' - e_i)$ must be small. Remark that a fraction of order $1/N$ of the $\lambda'$ such that $|\lambda'| = N + 1$ are not in $\mathcal{P}_{N+1}$, so that if all $c(\lambda)$ were equal, these border terms would cause our rate to be $1/N$. The key of the third inequality is to notice that each $c(\lambda)$ is appearing in the sum once for each term in its Clebsch-Gordan series (4.4), and that there are at most d terms. Please note that there are d terms if $\lambda \in \mathcal{P}_N$, and if $\lambda'$ is in $\mathcal{P}_{N+1}$, far from the border, then $\lambda' - e_i$ is in $\mathcal{P}_N$, far from the border.

The conclusion of these heuristics is that we must choose coefficients "locally" approximately equal (with only a small variation in ratio), and that the coefficients must go to 0 when we are approaching the border of $\mathcal{P}_N$.

One weight satisfying these heuristics is the following:

$$c(\lambda) = \mathcal{N} \prod_{i=1}^d p_i, \qquad (5.1)$$

where $\mathcal{N}$ is a normalization constant to ensure that (2.3) is satisfied and $p_i = \lambda_i - \lambda_{i+1}$ (with $\lambda_{d+1} = 0$). We shall use it below, and prove that it delivers the $1/N^2$ rate.
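The risk formula (4.5) with the product weights (5.1) can be evaluated exactly for small N and d. The sketch below is a brute-force enumeration under the conventions stated in the text (labels are non-increasing d-tuples with |λ| = N, and c(λ) is proportional to the product of the differences p_i = λ_i − λ_{i+1} with λ_{d+1} = 0). The function names are illustrative, and N must be large enough that at least one weight is non-zero.

```python
from math import prod, sqrt


def partitions(n, d):
    """Yield every non-increasing d-tuple of non-negative integers summing to n."""
    def rec(remaining, slots, largest):
        if slots == 0:
            if remaining == 0:
                yield ()
            return
        smallest = -(-remaining // slots)  # ceil(remaining / slots)
        for first in range(min(remaining, largest), smallest - 1, -1):
            for rest in rec(remaining - first, slots - 1, first):
                yield (first,) + rest
    yield from rec(n, d, n)


def raw_weight(lam):
    """Unnormalized c(lam): product of p_i = lam_i - lam_{i+1}, with lam_{d+1} = 0."""
    extended = lam + (0,)
    return prod(extended[i] - extended[i + 1] for i in range(len(lam)))


def risk(n, d):
    """Right-hand side of (4.5) for the weights (5.1), by direct enumeration."""
    raw = {lam: raw_weight(lam) for lam in partitions(n, d)}
    norm = sqrt(sum(w * w for w in raw.values()))
    if norm == 0:
        raise ValueError("n is too small: every weight vanishes")
    c = {lam: w / norm for lam, w in raw.items()}
    total = 0.0
    for lam_prime in partitions(n + 1, d):
        inner = 0.0
        for i in range(d):
            lam = lam_prime[:i] + (lam_prime[i] - 1,) + lam_prime[i + 1:]
            inner += c.get(lam, 0.0)  # labels that are not valid contribute 0
        total += inner ** 2
    return 1.0 - total / d ** 2
```

For d = 2 and N = 4, only λ = (3, 1) carries weight and the formula gives a risk of 1/2. Evaluating `n * n * risk(n, 2)` for growing even n gives values approaching 10, in line with the constant quoted for d = 2 at the end of the paper.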

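A first remark about these weights is that $c(\lambda) = 0$ if $\lambda \notin \mathcal{P}_N$. Now, for any $\lambda \in \mathcal{P}_N$, we have $\mathcal{M}(\lambda) \ge \mathcal{D}(\lambda)$, so that we do not need an ancilla.

Indeed, using hook formulas (see [13]), we get $\mathcal{M}(\lambda)/\mathcal{D}(\lambda) = N! \prod_{i=1}^d \frac{(d-i)!}{(\lambda_i + d - i)!}$. Now for $\lambda \in \mathcal{P}_N$, we know that $\lambda_d \neq 0$. Under this constraint and $\sum_i \lambda_i = N$, the minimum of this ratio is attained by $\lambda_1 = N - d + 1$ and $\lambda_i = 1$ for $i \neq 1$. We end up with exactly 1.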



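We shall now use (5.1) and express the numerator of (4.5) with our choice of $p_i$. Notice first that if the $p'_j$ characterize $\lambda'$, then those which characterize $\lambda' - e_i$ are given by $p^{(i)}_j = p'_j + \delta_{j,i-1} - \delta_{j,i}$. So

$$\sum_{\lambda' : |\lambda'| = N+1} \Big( \sum_{i \in S(\lambda')} c(\lambda' - e_i) \Big)^2 = \mathcal{N}^2 \sum_{\lambda' : |\lambda'| = N+1} \Big( \sum_{i \in S(\lambda')} \Big( \prod_{j=1}^d p'_j + r_{\lambda'}(i) \Big) \Big)^2$$

with

$$r_{\lambda'}(i) = \prod_{j=1}^d p^{(i)}_j - \prod_{j=1}^d p'_j.$$

Introducing another notation will make this slightly more compact. For a vector x with d components and $\mathcal{E}$ a subset of $\{1, \ldots, d\}$, define:

$$x_{\setminus \mathcal{E}} = \prod_{j \notin \mathcal{E}} x_j. \qquad (5.2)$$

Then

$$r_{\lambda'}(i) = -p'_{\setminus\{i\}} + \delta_{i>1} \big( p'_{\setminus\{i-1\}} - p'_{\setminus\{i, i-1\}} \big).$$

Notice now that for $\lambda \in \mathcal{P}_N$, there are exactly d irreducible representations appearing in the Clebsch-Gordan decomposition of $\lambda \otimes \square$ (4.4). So that $c(\lambda)^2$ appears exactly d times in $\sum_{\lambda' : |\lambda'| = N+1} \sum_{i \in S(\lambda')} c(\lambda' - e_i)^2$. We may then rewrite the renormalization constant $\mathcal{N}$ as

$$\mathcal{N}^{-2} = \frac{1}{d} \sum_{\lambda' : |\lambda'| = N+1} \sum_{i \in S(\lambda')} \Big( \prod_{j=1}^d p'_j + r_{\lambda'}(i) \Big)^2.$$

With our values of $c(\lambda)$, we aim at proving:

$$\frac{\sum_{\lambda' : |\lambda'| = N+1} \Big( \sum_{i \in S(\lambda')} \big( \prod_{j=1}^d p'_j + r_{\lambda'}(i) \big) \Big)^2}{d \sum_{\lambda' : |\lambda'| = N+1} \sum_{i \in S(\lambda')} \big( \prod_{j=1}^d p'_j + r_{\lambda'}(i) \big)^2} = 1 + O(N^{-2}).$$

Let us expand the numerator:

$$\sum_{\lambda' : |\lambda'| = N+1} \Big( \sum_{i \in S(\lambda')} \Big( \prod_{j=1}^d p'_j + r_{\lambda'}(i) \Big) \Big)^2 = C_t (1 + t_1 + t_2),$$

with

$$C_t = \sum_{\lambda'} (\#S(\lambda'))^2 \prod_{j=1}^d (p'_j)^2,$$

$$t_1 = \frac{2 \sum_{\lambda'} \sum_{i \in S(\lambda')} \#S(\lambda')\, r_{\lambda'}(i) \prod_{j=1}^d p'_j}{C_t},$$

$$t_2 = \frac{\sum_{\lambda'} \big( \sum_{i \in S(\lambda')} r_{\lambda'}(i) \big)^2}{C_t}.$$

Similarly the denominator can be read as:

$$d \sum_{\lambda' : |\lambda'| = N+1} \sum_{i \in S(\lambda')} \Big( \prod_{j=1}^d p'_j + r_{\lambda'}(i) \Big)^2 = C_u (1 + u_1 + u_2),$$

with

$$C_u = \sum_{\lambda'} d\, \#S(\lambda') \prod_{j=1}^d (p'_j)^2,$$

$$u_1 = \frac{2d \sum_{\lambda'} \sum_{i \in S(\lambda')} r_{\lambda'}(i) \prod_{j=1}^d p'_j}{C_u},$$

$$u_2 = \frac{d \sum_{\lambda'} \sum_{i \in S(\lambda')} r_{\lambda'}(i)^2}{C_u}.$$

With these notations, we aim at proving the set of estimates given in Lemma V.1. Indeed they imply:

$$\frac{\sum_{\lambda' : |\lambda'| = N+1} \Big( \sum_{i \in S(\lambda')} \big( \prod_{j=1}^d p'_j + r_{\lambda'}(i) \big) \Big)^2}{d \sum_{\lambda' : |\lambda'| = N+1} \sum_{i \in S(\lambda')} \big( \prod_{j=1}^d p'_j + r_{\lambda'}(i) \big)^2} = 1 + t_2 - u_2 + O(N^{-3})$$

with $(t_2 - u_2)$ of order $N^{-2}$. By (4.5), the risk of the estimator is then $u_2 - t_2 + O(N^{-3})$. Thus proving Lemma V.1 amounts to proving the $1/N^2$ rate.

We shall make use of the notation $\Theta(f)$, meaning that there are universal positive constants m and M such that:

$$m f \le \Theta(f) \le M f.$$

Lemma V.1. With the above notations,

$$C_t = C_u = d^2 \sum_{\lambda' : |\lambda'| = N+1} \Big( \prod_{i=1}^d p'_i \Big)^2 = \Theta(N^{3d-1}), \qquad (5.3)$$

$$t_1 = u_1 = O(N^{-1}),$$

$$t_2 = O(N^{-2}),$$

$$u_2 = O(N^{-2}).$$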

Proof. We first prove the first line.

Indeed for $\lambda' \in \mathcal{P}_{N+1}$, all i are in $S(\lambda')$, and $\big( \sum_{i \in S(\lambda')} \prod_{j=1}^d p'_j \big)^2 = d \sum_{i \in S(\lambda')} \prod_{j=1}^d (p'_j)^2 = d^2 \prod_{j=1}^d (p'_j)^2$. But if $\lambda' \notin \mathcal{P}_{N+1}$, there is at least one $p'_j$ equal to zero, so they do not contribute to the sum. So that $C_u = C_t = d^2 \sum_{\lambda' : |\lambda'| = N+1} \big( \prod_{j=1}^d p'_j \big)^2$.

We have then equality of the denominators of $t_1$ and $u_1$. The same argument gives equality of the numerators. On $\mathcal{P}_{N+1}$, $\#S(\lambda') = d$ so that

$$\sum_{i \in S(\lambda')} \#S(\lambda')\, r_{\lambda'}(i) \prod_{j=1}^d p'_j = d \sum_{i \in S(\lambda')} r_{\lambda'}(i) \prod_{j=1}^d p'_j,$$

and outside $\mathcal{P}_{N+1}$, $\prod_{j=1}^d p'_j = 0$ so that the equality still holds. Therefore $t_1 = u_1$.

Now $p'_j \le N + 1$ so that $\prod_{j=1}^d p'_j \le (N+1)^d$ and $|r_{\lambda'}(i)| \le 2(N+1)^{d-1}$. Moreover, as $1 \le \lambda'_i \le N + 1$ and $\lambda'_d$ is known if the other $\lambda'_i$ are known, the number of elements $\lambda'$ in $\mathcal{P}_{N+1}$ satisfies $\#\mathcal{P}_{N+1} \le (N+1)^{d-1}$. Thus the numerator of $t_1$ and $u_1$ is $O(N^{3d-2})$ and that of $t_2$ and $u_2$ is $O(N^{3d-3})$. To end the proof of the lemma, it is then sufficient to show that $C_u = \Theta(N^{3d-1})$.

Let us write $N + 1 = a(1 + d(d+1))/2 + b$ with a and b natural integers and $b < 1 + d(d+1)$. We then select $h_i$ for i = 1 to d such that $\sum_i h_i = a/2$. The number of ways of partitioning $a/2$ in d parts is $\binom{a/2 + d - 1}{d - 1}$, and this is $\Omega(a^{d-1}) = \Omega(N^{d-1})$. To each of these partitions, we associate a different $\lambda'$ in $\mathcal{P}_{N+1}$ through $\lambda'_i = (d - i + 1)a + \delta_{i=1} b + h_i$. For each of these $\lambda'$, we have $p'_j = \lambda'_j - \lambda'_{j+1} \ge a/2$, so that $\prod_{j=1}^d p'_j = \Omega(N^d)$. We may lower bound $C_u$ by the sum over these $\lambda'$ of $\prod_{j=1}^d (p'_j)^2$, so that we have proved $C_u = \Omega(N^{3d-1})$. □



VI. EVALUATION OF THE CONSTANT IN 
THE SPEED OF CONVERGENCE AND FINAL 
RESULT 

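The strategy we study is asymptotically optimal up to a constant, but a better constant can probably be obtained. Anything like $c(\lambda) = (\prod_j p_j)^\alpha$ with $\alpha > 1/2$ should yield the same rate, though it would be more cumbersome to prove. Polynomials in the $p_j$ could also bring some improvement. All the same we give in this section a quick evaluation of the constant, that may serve as a benchmark for more precise strategies.

Write $p'_j = (N+1) x_j$. Then, recalling our notation (5.2),

$$r_{\lambda'}(i) = (N+1)^{d-1} \big( -x_{\setminus\{i\}} + \delta_{i>1}\, x_{\setminus\{i-1\}} + O(N^{-1}) \big).$$

Similarly, the set of allowed $x = (x_1, \ldots, x_d)$ may be described as

$$S_{N+1} = \Big\{ x \,\Big|\, x_i (N+1) \in \mathbb{N};\ \sum_{j=1}^d j\, x_j = 1 \Big\},$$

where the linear constraint expresses $|\lambda'| = \sum_{j=1}^d j\, p'_j = N + 1$.

We may then rewrite, all sums being over $S_{N+1}$ and using that on $\mathcal{P}_{N+1}$ the terms $r_{\lambda'}(i)$ sum over i to $-(N+1)^{d-1}(x_{\setminus\{d\}} + O(N^{-1}))$:

$$t_2 = \frac{1}{(N+1)^2} \frac{\sum_x (x_{\setminus\{d\}})^2}{d^2 \sum_x \prod_{j=1}^d x_j^2} + O(N^{-3}),$$

$$u_2 = \frac{1}{(N+1)^2} \frac{\sum_x \big( 2 \sum_{i=1}^d (x_{\setminus\{i\}})^2 - 2 \sum_{i=2}^d x_{\setminus\{i\}} x_{\setminus\{i-1\}} - (x_{\setminus\{d\}})^2 \big)}{d \sum_x \prod_{j=1}^d x_j^2} + O(N^{-3}).$$

Subtracting, we obtain (the first sums being on $S_{N+1}$)

$$u_2 - t_2 + O(N^{-3}) = \frac{1}{(N+1)^2} \frac{\sum_x \Big[ 2d \Big( \sum_{i=1}^d (x_{\setminus\{i\}})^2 - \sum_{i=2}^d x_{\setminus\{i\}} x_{\setminus\{i-1\}} \Big) - (d+1) (x_{\setminus\{d\}})^2 \Big]}{d^2 \sum_x \prod_{j=1}^d x_j^2}. \qquad (6.1)$$

Now $S_{N+1}$ is the intersection S of the lattice in $[0, 1]^d$ with mesh size $1/(N+1)$ with the hyperplane given by the equation $\sum_{j=1}^d j\, x_j = 1$. Therefore the points of $S_{N+1}$ are a regular paving of a flat (d − 1)-dimensional volume, with more and more points (we know that $\#S_{N+1} = \Theta(N^{d-1})$). Therefore both denominator and numerator of (6.1) are Riemann sums with respect to the Lebesgue measure, with a multiplicative constant that is the same for both. Therefore we have proved:

Theorem VI.1. The estimator $\hat U$ corresponding to (5.1) has the following risk:

$$R_B(\hat U) = R_P(\hat U) = E_{1_{\mathbb{C}^d}}[\Delta(1_{\mathbb{C}^d}, \hat U)] = C N^{-2} + O(N^{-3}),$$

where C is the fraction

$$C = \frac{\int_S \Big[ 2d \Big( \sum_{i=1}^d (x_{\setminus\{i\}})^2 - \sum_{i=2}^d x_{\setminus\{i\}} x_{\setminus\{i-1\}} \Big) - (d+1) (x_{\setminus\{d\}})^2 \Big] dx}{d^2 \int_S \prod_{j=1}^d x_j^2\, dx}.$$

Up to a multiplicative constant, this risk is asymptotically optimal, both for a Bayes uniform prior and for global pointwise estimation.

Numerical estimation, up to two digits, for the low dimensions yields:

C ≈ 10 for d = 2
C ≈ 75 for d = 3
C ≈ 2.7 × 10^2 for d = 4.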

VII. CONCLUSION 

We have given a strategy for estimating an unknown unitary channel $U \in SU(d)$, and proved that the convergence rate of this strategy is $1/N^2$. We have further proved that this rate is optimal, even if the constant may be improved.

The interest of this result lies in that such rates are much faster than the $1/N$ achieved in classical estimation and, though they had already been obtained for SU(2), they were never before shown to hold for general SU(d).